perm filename CDOC[CMP,LSP] blob sn#208723 filedate 1976-03-31 generic text, type C, neo UTF8
COMMENT āŠ—   VALID 00006 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002		A Brief Description of the Stanford Lisp Compiler
C00005 00003			Fundamental Table Dispatch Mechanism
C00009 00004				Top Level
C00013 00005				Pass One
C00017 00006				Pass Two
C00019 ENDMK
CāŠ—;
	A Brief Description of the Stanford Lisp Compiler

	This document will attempt to give a brief explanation of the
fundamental structure and behavior of  the  Stanford  Lisp  compiler.
Throughout  the  operation  of the Lisp compiler, as described in the
Lisp 1.6 manual will be assumed known.  In all explicit  descriptions
below,  the  compiler will be taken to be operation on a disk file to
produce another disk  file,  as  this  is  the  fundamental  mode  of
operation.

			Organisation

	The compiler is divided into three major parts.     Toplevel,
the first of  these  is  concerned  with  processing  user  commands,
handling  files  and  in general providing the framework within which
actual compiling takes place.  The remaining two parts are  concerned
with  compilation  itself.       The  first,  Pass1, preprocesses the
expression, putting certain things in normal forms, expanding  macros
and recording information in tables.  The second, Pass2, operating on
the output of  Pass1,  preforms  the  final  code  generation.     In
addition,  there  are  verious  subsidiary  pieces  dealing  with IO,
debugging and general subroutine support. The three major pieces will
be  discussed  after a brief digression to consider a basic mechanism
which is used at several places in the compiler.


		Fundamental Table Dispatch Mechanism

	It is  a  common  process  in  Lisp  programs  to  process  a
S-expression  with an atomic car by checking for certain set atoms or
certain classes of atoms and calling routines corresponding to  these
various  cases.     For  example,  a  routine  to evaluate arithmetic
expressions might, after first check for an atomic  expression,  test
to  see  if car of the expression were one of plus, times etc.    and
calling routines to add or multiply as  appropriate.  In  a  slightly
more  complicated arithmetic interpreter, there might also be certain
classes.  Sin, Cosine and Tangent might all lead to the invocation of
the  same  routine  for  processing  trigonometric functions.     The
disadvantages  of  this  technique  are  twofold.     First,  it   is
inflexible,  and  second,  it is, in certain ways, inefficient.   The
latter is a problem of growth.     Though  the  tests  are  not  time
consuming,  the  total time increases with their number.    Much more
important, is the fact that when adding or modifying the behavior  of
a process written in this way, the original code must be altered even
for a pure addtion.  This may necessitate recompiling or  other  time
consuming  processes.   The  technique  which  will  now be described
attempts to systematize the process of dispatching on the car  of  an
expression in such a way as to remove these disadvantages.
	Given a non-atomic  expression  with  an  atomic  car  to  be
processed,  our basic intention is to search the property list of the
car for some property of interest. Rather than having a list  of  the
properties   which   are   of  interest,  however,  the  emphasis  on
flexibility  can  be  pushed   still   furtther   by   distinguishing
interesting  properties by examining their own property lists.   Thus
given an atom which is car of an expression, each  indicator  on  its
property  list  is  examined  to  see  if  it  in  turn has a special
property.  This last property is not a seat of flexibility but occurs
explicitly   in   the  code.    The  routine  in  the  compler  which
accomplishes this is called GETGET.  In the following sections, three
explicit cases of its use will be discussed.



			Top Level

	When the COMPL command is given, causing the compilation of a
file,   the  compiler  must  process  each  expression  in  the  file
appropriately.        The  file  may  contain  functions  defined  by
S-expressions,  functions  defined by Lap, declarations affecting the
compiler, expressions to be evaluated at  load  time,  comments  etc.
Each  of  these things must be processed differently.    The function
definitions must be compiled into Lap, the Lap must  be  examined  to
note  the types of the functions it defines, the declarations must be
processed for their intended effects, the runtime S-expressions  must
be  printed  out  unchanged,  and  the  comments must be absorbed and
ignored. Further, all of these things must be done in such a way that
changes and additions may be easily made.
	After all of the overhead of deciphering the  user's  command
and  preparing  for  a file processing operation has been put by, the
compiler  settles down to considering  each expression  in the  input
file to decide on appropriate action.  The routine which does this is
called COMPEFFECT.  If the expression is atomic, it is simply printed
and  no  further  action  is  taken.  If it is not atomic, but has an
atomic  car,  the  function  GETGET  is  called  with  the   property
COMPEFFECT  in  mind.   If  the car of the expression has a property,
which  in  turn  has  the  COMPEFFECT  property   then   the   entire
S-expression  is  handed  over  to the function which is the value of
this COMPEFFECT property. At present, in the case of  the  top  level
there  are  only  two  atoms which have the COMPEFFECT property.  The
first of these is the atom MACRO. Thus if car of an expression  is  a
macro,  then  control  will be given to the function ACTONMACRO which
will be found as the CCOMPEFFECT property of  the  atom  MACRO.   The
function  ACTONMACRO  will  then  make use of the macro definition to
expand the macro and then apply the function ACTONEXPR to the result.
The other distinguished property which will be uncovered by ACTONEXPR
is COMPACTION.  This property indicates that the atom in question  is
to  be dealt with specially.  The value of the COMPEFFECT property of
the atom COMPACTION is a simple function which finds  the  COMPACTION
property of the original atom and hands it control.


			Pass One

	Pass  one  is  the  first part of compiling proper.  As noted
above, it will put the expression in normal  form  by  such  acts  as
expanding macros and  recording information for the use of the second
pass.  As  with  the  top  level,  control  is  passed  around  among
functions  which  occur  as  properties  of  various  atoms  and  are
discovered by the operation of GETGET. The property given  to  GETGET
in  this  case  is  PASS1.    Several  atoms  have the PASS1 property
including SUBR, FSUBR, MACRO  and  their  relatives.    As  with  the
COMPEFFECTs  of  the  top  level, there is a property which indicates
that an expression is to receive special treatment.  This property is
P1.
	The information kept by  pass  one  is  primarily  about  the
variables  occurring in the  expression.  Variables are  divided into
two classes,  SPECIAL and  LOCAL.  Into the  former go all  variables
whether  bound  or  not  which  are  know  to the compiler as special
variables from either declarations or prior  free  occurrence.    The
local  variables  are kept on a separate list and certain information
is kept about them.
	During  the  process of compilation, it is often necessary to
refer to quantities which are old values of variables.   Rather  than
assigning  to each such quantity a separate name and treating is as a
peculiar  sort  of  variable,  the  compiler  keeps  a  count  during
compilation  which  is  used to  record the progress of events.  This
count which is  set  to  zero  at  the  beginning  of  each  pass  is
incremented each time that any variable is referenced in any way, and
at certain other places.  It  is  desired  to  guarantee  that  if  a
variable takes on two different values, it may not do so at one value
of the count. The count thus serves the function of a sort of  "time"
within the compilation.
	The first pass, in addition to maintaining  the  count,  will
record  for  each local variable the last value of the count at which
it was seen.  Thus if the count during the second  pass  should  ever
become  larger  that  that  associated  with  some variable, then the
variable will never be used again.  This allows the space it occupied
to be used for something else.


			Pass Two

	The  fundamental processes of the compiler take place in pass
two. Most of the complexity of this section is due  to  a  desire  to
exploit  the architecture of the PDP-10 by placing function arguments
in the accumulators.   In order to avoid  the  orgy  of  pushing  and
popping the most frequent action in which the compiler must engage is
that of compiling the arguments to  a  function,  then  applying  the
function to the arguments.  If this were done in the straight forward
way in which it would be done if arguments were placed on the  stack,
the  need to maintain transparancy would result in an orgy of pushing
and popping  which  would  paralyse  the  computation.   Instead  the
compiler follows a more sophisticated and vastly more confusing plan.
It compiles in  turn  each  argument  without  loading  it  into  any
particualar  place.   This  amounts  to guaranteeing that the desired
result exists somewhere in the machine and making note that it may be
moved  but  not  damaged. Finally, after all of a functions arguments
have been compiled, they are loaded into the first  few  accumulators
and the funtion is called.